- Home
- Search Results
- Page 1 of 1
Search for: All records
-
Total Resources5
- Resource Type
-
0005000000000000
- More
- Availability
-
50
- Author / Contributor
- Filter by Author / Creator
-
-
Cheshmi, Kazem (4)
-
Dehnavi, Maryam Mehri (3)
-
Strout, Michelle Mills (3)
-
Kamil, Shoaib (2)
-
Cheshmi, Kazem and (1)
-
Liu, Bangtian (1)
-
Mehri Dehnavi, Maryam (1)
-
Soori, Saeed (1)
-
Strout, Michelle (1)
-
#Tyler Phillips, Kenneth E. (0)
-
#Willis, Ciara (0)
-
& Abreu-Ramos, E. D. (0)
-
& Abramson, C. I. (0)
-
& Abreu-Ramos, E. D. (0)
-
& Adams, S.G. (0)
-
& Ahmed, K. (0)
-
& Ahmed, Khadija. (0)
-
& Aina, D.K. Jr. (0)
-
& Akcil-Okan, O. (0)
-
& Akuom, D. (0)
-
- Filter by Editor
-
-
& Spizer, S. M. (0)
-
& . Spizer, S. (0)
-
& Ahn, J. (0)
-
& Bateiha, S. (0)
-
& Bosch, N. (0)
-
& Brennan K. (0)
-
& Brennan, K. (0)
-
& Chen, B. (0)
-
& Chen, Bodong (0)
-
& Drown, S. (0)
-
& Ferretti, F. (0)
-
& Higgins, A. (0)
-
& J. Peters (0)
-
& Kali, Y. (0)
-
& Ruiz-Arias, P.M. (0)
-
& S. Spitzer (0)
-
& Sahin. I. (0)
-
& Spitzer, S. (0)
-
& Spitzer, S.M. (0)
-
(submitted - in Review for IEEE ICASSP-2024) (0)
-
-
Have feedback or suggestions for a way to improve these results?
!
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Dependence between iterations in sparse computations causes inefficient use of memory and computation resources. This paper proposes sparse fusion, a technique that generates efficient parallel code for the combination of two sparse matrix kernels, where at least one of the kernels has loop-carried dependencies. Existing implementations optimize individual sparse kernels separately. However, this approach leads to synchronization overheads and load imbalance due to the irregular dependence patterns of sparse kernels, as well as inefficient cache usage due to their irregular memory access patterns. Sparse fusion uses a novel inspection strategy and code transformation to generate parallel fused code optimized for data locality and load balance. Sparse fusion outperforms the best of unfused implementations using ParSy and MKL by an average of 4.2× and is faster than the best of fused implementations using existing scheduling algorithms, such as LBC, DAGP, and wavefront by an average of 4× for various kernel combinations.more » « less
-
Liu, Bangtian; Cheshmi, Kazem; Soori, Saeed; Strout, Michelle Mills; Dehnavi, Maryam Mehri (, PPoPP '20: Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)
-
Cheshmi, Kazem and (, SC '18 Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis)
-
Cheshmi, Kazem; Kamil, Shoaib; Strout, Michelle Mills; Dehnavi, Maryam Mehri (, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis)
-
Cheshmi, Kazem; Kamil, Shoaib; Strout, Michelle Mills; Dehnavi, Maryam Mehri (, SC '17 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis)
An official website of the United States government

Full Text Available